Some Properties of Generalized Fused Lasso and Its Applications to High Dimensional Data
Abstract
Identifying homogeneous subgroups of variables can be challenging in high-dimensional data analysis with highly correlated predictors. The generalized fused lasso has been proposed to simultaneously select correlated variables and identify them as predictive clusters. In this article, we study several properties of the generalized fused lasso. First, we present a geometric interpretation of the generalized fused lasso along with a discussion of its persistency. Second, we analytically show its grouping property. Third, we introduce a modified version of the generalized fused lasso and perform comprehensive simulation studies to compare our version with other existing methods, showing that it outperforms other variable selection methods in terms of prediction error and parsimony. We describe two applications of our method, in soil science and in near-infrared spectroscopy studies. These examples, which involve vastly different data types, demonstrate the flexibility of the methodology, particularly for high-dimensional data.
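For readers unfamiliar with the method, the following is a minimal sketch of the kind of objective usually meant by "generalized fused lasso": a squared-error loss plus an ℓ1 sparsity term and an ℓ1 fusion term over pairs of related coefficients. The abstract does not state the exact formulation, so the function names, the unweighted edge list `edges`, and the tuning parameters `lam1` and `lam2` are illustrative assumptions, not the authors' definitions; the modified version introduced in the article may differ.

```python
import numpy as np


def generalized_fused_lasso_penalty(beta, edges, lam1, lam2):
    """Assumed penalty: lam1 * sum_j |beta_j| + lam2 * sum_{(j,k) in edges} |beta_j - beta_k|.

    `edges` is a hypothetical list of index pairs marking which coefficients
    are encouraged to share a common value (for example, correlated predictors).
    """
    beta = np.asarray(beta, dtype=float)
    sparsity = lam1 * np.sum(np.abs(beta))                         # drives coefficients to zero
    fusion = lam2 * sum(abs(beta[j] - beta[k]) for j, k in edges)  # drives linked coefficients together
    return sparsity + fusion


def objective(X, y, beta, edges, lam1, lam2):
    """Half the residual sum of squares plus the assumed penalty."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.asarray(beta, dtype=float)
    residual = y - X @ beta
    return 0.5 * residual @ residual + generalized_fused_lasso_penalty(beta, edges, lam1, lam2)
```

Under this sketch, coefficients joined by an edge pay no fusion cost when they are exactly equal, which is how the penalty can group correlated predictors into clusters while the sparsity term removes irrelevant ones.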
Similar resources
Adaptive Generalized Fused-Lasso: Asymptotic Properties and Applications
The Lasso has been widely studied and used in many applications over the last decade. It has also been extended in various directions, in particular to ensure asymptotic oracle properties through adaptive weights (Zou, 2006). Another direction has been to incorporate additional knowledge within the penalty to account for some structure among features. Among such strategies the Fused-Lasso (Tibsh...
Split Bregman method for large scale fused Lasso
Abstract: Ordering of regression or classification coefficients occurs in many real-world applications. Fused Lasso exploits this ordering by explicitly regularizing the differences between neighboring coefficients through an l1 norm regularizer. However, due to nonseparability and nonsmoothness of the regularization term, solving the fused Lasso problem is computationally demanding. Existing s...
An Extended Generalized Lindley Distribution and Its Applications to Lifetime Data
In this paper, a four-parameter extension of the generalized Lindley distribution is introduced. The new distribution includes the power Lindley, Lindley, generalized (Stacy) gamma, gamma, Weibull, Rayleigh, exponential and half-normal distributions. Several statistical properties of the distribution are explored. Then, a bivariate version of the proposed distribution is derived. Using a simula...
Estimation and Selection via Absolute Penalized Convex Minimization And Its Multistage Adaptive Applications
The ℓ1-penalized method, or the Lasso, has emerged as an important tool for the analysis of large data sets. Many important results have been obtained for the Lasso in linear regression which have led to a deeper understanding of high-dimensional statistical problems. In this article, we consider a class of weighted ℓ1-penalized estimators for convex loss functions of a general form, including ...
On the robustness of the generalized fused lasso to prior specifications
Using networks as prior knowledge to guide model selection is a way to reach structured sparsity. In particular, the fused lasso, which was originally designed to penalize differences of coefficients corresponding to successive features, has been generalized to handle features whose effects are structured according to a given network. As with any prior information, the network provided in the penalty m...